TCGA Gene Expression data (IlluminaHiSeq_RNASeqV2)TCGA Protein Expression data (RPPA)TCGA Copy number variation (CNV) Affymetrix SNP Array 6.0TCGA DNA methylation Illumina Human Methylation 450TCGA Mutation Mutation Annotation Format (MAF)TCGA microRNA Expression Illumina HiSeqTCGA Clinical DataGTEx gene expression dataIHEC gene expression dataGEO gene expression datasetRecently, The Cancer Genome Atlas (TCGA’s) Pan-Cancer Atlas initiative presented a comprehensive collection of 27 studies covering 11,000 patient tumors from 33 cancer types. These studies investigated cancer complexity from different angles by integrating multi -omics and clinical data. In particular, computational analyses have led to the identification of 299 cancer-driver genes and over 3,400 driver mutations. However, it still remains critical to clarify the consequences of each alteration and the underlying biological effects. We will discuss several computational tools that are useful in clarifying gene functions when performing integrative analysis of multi-omics datasets. In order to deal with the challenges of data retrieval and integration, TCGAbiolinks (Colaprico et al. 2018, PMID: 26704973) and DeepBlueR (Albrecht et al., 2017, PMID: 28334349) were developed to retrieve data from TCGA, CPTAC, GTEx, GEO and IHEC, Blueprint, ENCODE, Roadmap, respectively. Tumor-specific cancer-driver-gene events and downstream impact can be elucidated with MoonlightR by integrating these datasets (Colaprico et al. 2018). TCGAbiolinks and MoolightR have been used successfully in multiple studies for oncogenic processes identification (Ding et al., 2018, PMID: 29625049), oncogenic clinically actionable driver genes discovery (Bailey et al., 2018, PMID: 29625053), and comprehensive immune landscape characterization (Thorsson et al., 2018, PMID: 29628290)
In this workshop we will demonstrate the capability of TCGAbiolinks, deepblueR and MoonlightR for (i) integrating multi -omics data from different consortium, (ii) to reproduce the six immune subtypes from TCGA PanCancer and (ii) how features (Immune Subtypes, Oncogenic Processes, Driver Genes and Stemness) can be used by the end user to expand their understating of their own un-published data. The workshop is organized in 4 subsections: 1. Retrieve datasets from (TCGA, GTEx, GEO and IHEC) 2. Discover cancer driver genes 3. Identify immune subtypes 4. Stemness scores
You can easily query - download - prepare multi -omics data from GDC: 1. Gene expression 2. Copy number 3. Protein expression (RRPA) 4. Methylation 5. Clinical data 6. microRNA expression 7. Mutation 8. Chromatin Accessibility
TCGA Gene Expression data (IlluminaHiSeq_RNASeqV2)The NCI’s Genomic Data Commons (GDC) provides the cancer research community with a unified data repository that enables data sharing across cancer genomic studies in support of precision medicine. The GDC supports several cancer genome programs at the NCI Center for Cancer Genomics (CCG), including The Cancer Genome Atlas (TCGA) and Therapeutically Applicable Research to Generate Effective Treatments (TARGET). [https://gdc.cancer.gov/]
The end user can search TCGA’s samples for a specific cancer of origin, download and prepare a matrix of gene expression.
DataDir <- "~/Dropbox (Personal)/Umiami/TCGAanalysis/GDCdata/TCGA-BRCA"
require(TCGAbiolinks)
cancerType <- "BRCA"
# Query platform Illumina HiSeq with a list of barcode
query <- GDCquery(project = paste0("TCGA-",cancerType),
data.category = "Gene expression",
data.type = "Gene expression quantification",
experimental.strategy = "RNA-Seq",
platform = "Illumina HiSeq",
file.type = "results",
legacy = TRUE)
# We select 10 tumor and 10 normal samples
Sample_sel <- query$results[[1]]$cases
# TP is Primary Tumor
# NT is Solid Tissue Normal
Sample_sel_TP <- TCGAquery_SampleTypes(barcode = Sample_sel,typesample = "TP")
Sample_sel_NT <- TCGAquery_SampleTypes(barcode = Sample_sel,typesample = "NT")
Sample_sel_short <- c(Sample_sel_TP[1:10],Sample_sel_NT[1:10])
# we need to create a new query with the selected barcodes
query_down <- GDCquery(project = paste0("TCGA-",cancerType),
data.category = "Gene expression",
data.type = "Gene expression quantification",
experimental.strategy = "RNA-Seq",
platform = "Illumina HiSeq",
file.type = "results",
barcode = Sample_sel_short,
legacy = TRUE)
# Download a list of barcodes with platform IlluminaHiSeq_RNASeqV2
GDCdownload(query_down,directory = DataDir)
# Prepare expression matrix with geneID in the rows and samples (barcode) in the columns
# rsem.genes.results as values
BRCARnaseqSE <- GDCprepare(query_down,directory = DataDir)
#
# TCGAanalyze_Preprocessing perform Array Array Intensity correlation (AAIC). It defines a square symmetric matrix of spearman correlation among samples. According this matrix and boxplot of correlation samples by samples it is possible to find samples with low correlation that can be identified as possible outliers.
dataPrep <- TCGAanalyze_Preprocessing(BRCARnaseqSE,
cor.cut = 0.6,)
dataNorm <- TCGAanalyze_Normalization(tabDF = dataPrep,
geneInfo = geneInfoHT,
method = "gcContent")
pdf("TCGA_BRCA_dataPrep_Norm_boxplot.pdf")
par(mfrow=c(2,1))
boxplot(dataPrep, outline = FALSE)
boxplot(dataNorm, outline = FALSE)
dev.off()
The result is shown below:
#> Registered S3 methods overwritten by 'ggplot2':
#> method from
#> [.quosures rlang
#> c.quosures rlang
#> print.quosures rlang
#> Registered S3 method overwritten by 'R.oo':
#> method from
#> throw.default R.methodsS3
| TCGA-A7-A13G-11A-51R-A13Q-07 | TCGA-E9-A1NG-11A-52R-A14M-07 | |
|---|---|---|
| MFSD11|79157 | 1474.94 | 1018.77 |
| MPP3|4356 | 188.00 | 170.00 |
| CSRP2|1466 | 1289.00 | 913.00 |
| SLC7A3|84889 | 0.00 | 8.00 |
| SULF2|55959 | 3563.00 | 5546.00 |
| EFHD2|79180 | 848.00 | 2715.00 |
| LHX3|8022 | 0.00 | 0.00 |
| NTRK1|4914 | 4.00 | 4.00 |
| KDELR3|11015 | 1135.00 | 1663.00 |
| PRKDC|5591 | 10953.00 | 9567.00 |
The result from TCGAanalyze_Preprocessing and TCGAanalyze_Normalization is shown below:
TCGA Protein Expression data (RPPA)You can easily search TCGA samples, download and prepare a matrix of protein RPPA expression.
require(TCGAbiolinks)
cancerType <- "BRCA"
query.RPPA <- GDCquery(project = paste0("TCGA-",cancerType),
legacy = TRUE,
data.category = "Protein expression",
platform = "MDA_RPPA_Core",
data.type = "Protein expression quantification",
file.type = "expression",
sample.type = c("Primary solid Tumor"))
samples.RPPA <- query.RPPA$results[[1]]$cases
query.RPPA.down <- GDCquery(project = paste0("TCGA-",cancerType),
legacy = TRUE,
data.category = "Protein expression",
data.type = "Protein expression quantification",
platform = "MDA_RPPA_Core",
file.type = "expression",
sample.type = c("Primary solid Tumor"),
barcode = samples.RPPA)
GDCdownload(query.RPPA.down,
directory = PathDir)
data.RPPA <- GDCprepare(query.RPPA.down,
directory = PathDir)
TCGA Copy number variation (CNV) Affymetrix SNP Array 6.0You can easily search TCGA samples, download and prepare a matrix of CN segments.
require(TCGAbiolinks)
cancerType <- "BRCA"
query.cnv <- GDCquery(project = cancerType,
data.category = "Copy number variation",
legacy = TRUE,
data.type = "Copy number segmentation",
platform = "Affymetrix SNP Array 6.0",
file.type = "nocnv_hg19.seg",
sample.type = c("Primary solid Tumor"))
samples.cnv <- query.cnv$results[[1]]$cases[1:20]
query.cnv.down <- GDCquery(project = cancerType,
data.category = "Copy number variation",
legacy = TRUE,
platform = "Affymetrix SNP Array 6.0",
data.type = "Copy number segmentation",
file.type = "nocnv_hg19.seg",
sample.type = c("Primary solid Tumor"),
barcode = samples.cnv)
GDCdownload(query.cnv.down,
directory = PathDir)
data.cnv <- GDCprepare(query.cnv.down,
directory = PathDir)
data.cnv <- as.data.frame(data.cnv)
TCGA DNA methylation Illumina Human Methylation 450You can easily search TCGA samples, download and prepare a matrix of DNA methylation probes.
require(TCGAbiolinks)
cancerType <- "BRCA"
query.met <- GDCquery(project = cancerType,
legacy = TRUE,
data.category = "DNA methylation",
platform = "Illumina Human Methylation 450",
sample.type = c("Primary solid Tumor"))
samples.met <- query.met$results[[1]]$cases[1:20]
query.met.down <- GDCquery(project = cancerType,
legacy = TRUE,
data.category = "DNA methylation",
platform = "Illumina Human Methylation 450",
sample.type = c("Primary solid Tumor"),
barcode = samples.met)
GDCdownload(query.met.down,
directory = PathDir)
data.met <- GDCprepare(query.met.down,
directory = PathDir)
TCGA Mutation Mutation Annotation Format (MAF)You can easily search TCGA samples, download and prepare a matrix of Mutations.
require(TCGAbiolinks)
cancerType <- "TCGA-BRCA"
query.mut <- GDCquery(project = cancerType,
data.category = "Simple nucleotide variation",
data.type = "Simple somatic mutation",
access = "open",
sample.type = c("Primary solid Tumor"),
legacy = TRUE)
# Check maf availables
query.mut$results[[1]]$file_name
query.mut <- GDCquery(project = cancerType,
data.category = "Simple nucleotide variation",
data.type = "Simple somatic mutation",
access = "open",
sample.type = c("Primary solid Tumor"),
legacy = TRUE,
file.type = query.mut$results[[1]]$file_name[1])
samples.mut <- query.mut$results[[1]]$cases
query.mut.down <- GDCquery(project = cancerType,
data.category = "Simple nucleotide variation",
data.type = "Simple somatic mutation",
access = "open",
sample.type = c("Primary solid Tumor"),
legacy = TRUE,
barcode = samples.mut,
file.type = query.mut$results[[1]]$file_name[1])
GDCdownload(query.mut.down,
directory = PathDir)
data.mut <- GDCprepare(query.mut.down,
directory = PathDir)
TCGA microRNA Expression Illumina HiSeqYou can easily search TCGA samples, download and prepare a matrix of microRNA Expression
require(TCGAbiolinks)
cancerType <- "TCGA-BRCA"
query.miR <- GDCquery(project = cancerType,
data.category = "Gene expression",
data.type = "miRNA gene quantification",
platform = "Illumina HiSeq",
file.type = "hg19.mirbase20.mirna.quantification",
legacy = TRUE)
samples.miR <- query.miR$results[[1]]$cases
query.miR.down <- GDCquery(project = cancerType,
data.category = "Gene expression",
data.type = "miRNA gene quantification",
platform = "Illumina HiSeq",
file.type = "hg19.mirbase20.mirna.quantification",
legacy = TRUE,
barcode = TNBCsamplesmiRlong)
GDCdownload(query.miR.down,
directory = PathDir)
data.miR <- GDCprepare(query.miR.down,
directory = PathDir)
TCGA Clinical DataYou can easily search TCGA samples, download and prepare a matrix of clinical data
require(TCGAbiolinks)
cancerType <- "BRCA"
dataClin <- GDCquery_clinic(project = paste0("TCGA-",cancerType),type = "clinical")
GTEx gene expression dataThe Genotype-Tissue Expression (GTEx) project is an ongoing effort to build a comprehensive public resource to study tissue-specific gene expression and regulation. Samples were collected from 53 non-diseased tissue sites across nearly 1000 individuals, primarily for molecular assays including WGS, WES, and RNA-Seq. [https://gtexportal.org/home/] You can easily search GTEx samples, download and prepare a matrix of gene expression [RNA-Seq].
#Genotype-Tissue Expression (GTEx) project, which is another large-scale repository cataloging gene expression from healthy individuals.
# In the following example we will retrieve data from brain's tissue.
require(TCGAbiolinks)
data_gtex_brain <- TCGAquery_recount2("gtex", tissue = "brain")
Gtex_brain_annotation <- colData(data_gtex_brain$gtex_brain)
sort(table(Gtex_brain_annotation$smtsd))
data_gtex_brain_GeneExpr <- assay(data_gtex_brain$gtex_brain,1)
# data_gtex_brain_GeneExpr[1:5,1:5]
#SRR2166176 SRR2167642 SRR607445 SRR663753 SRR1485892
#ENSG00000000003.14 98669 122656 74728 56933 69009
#ENSG00000000005.5 764 1018 760 2105 151
#ENSG00000000419.12 1076085 1034842 56668 207245 124647
#ENSG00000000457.13 746248 752634 48502 121776 66762
#ENSG00000000460.16 603111 588176 25143 55928 34473
IHEC gene expression dataThe International Human Epigenome Consortium (IHEC) is a global consortium with the primary goal of providing free access to high-resolution reference human epigenome maps for normal and disease cell types to the research community. [http://ihec-epigenomes.org/]
The BLUEPRINT consortium has been formed with the aim to further the understanding of how genes are activated or repressed in both healthy and diseased human cells. BLUEPRINT focuses on distinct types of haematopoietic cells from healthy individuals and on their malignant leukaemic counterparts.
You can easily search IHEC BLUEPRINT samples, download and prepare a matrix of gene expression.
require(DeepBlueR)
require(dplyr)
# List all IHEC samples
IHEC_samples <- deepblue_list_samples()
#table(IHEC_samples$source)
# BLUEPRINT Epigenome Blueprint HSC differentiation BLUEPRINT Progenitors
# 1451 76 76
# CEEHRC ChIP-Atlas CREST
# 529 497 38
# DEEP DEEP (IHEC) ENCODE
# 194 39 3317
# ENCODE FTP Roadmap Epigenomics TEPIC reprocessed IHEC data
# 9 144 28
# List all BLUEPRINT samples
blueprint_samples <- deepblue_list_samples(
extra_metadata = list("source" = "BLUEPRINT Epigenome"))
# Extract their ids
blueprint_samples_ids <- deepblue_extract_ids(blueprint_samples)
# Select gene expression data. We assign gene names using Gencode 22
gene_exprs_query <- deepblue_select_expressions(sample_ids = blueprint_samples_ids,
expression_type = "gene",
gene_model = "gencode v22")
# We request the data and define the output format
request = deepblue_get_regions(query_id = gene_exprs_query,
"@GENE_ID(gencode v22),FPKM,@BIOSOURCE,@SAMPLE_ID")
# We download the data
gene_regions <- deepblue_download_request_data(request)
# We retain a table mapping sample ids to bisources
sample_names <- dplyr::select(gene_regions, `@BIOSOURCE`, `@SAMPLE_ID`) %>%
dplyr::distinct()
# We filter out duplicated gene entries
genes_one_sample <- dplyr::filter(gene_regions, `@SAMPLE_ID` == "s10678")
duplicated_genes <- genes_one_sample[
which(duplicated(genes_one_sample$`@GENE_ID(gencode v22)`)),
"@GENE_ID(gencode v22)"]
# We convert the gene expression from a list to a data frame and subsequently...
genes_matrix = dplyr::filter(gene_regions,
!(`@GENE_ID(gencode v22)` %in% duplicated_genes)) %>%
dplyr::select(-`@BIOSOURCE`) %>%
tidyr::spread(key = `@SAMPLE_ID`, value = FPKM)
# ...to a numeric matrix
genes <- genes_matrix[,1]
genes_matrix <- data.matrix(genes_matrix[,-1])
rownames(genes_matrix) <- genes
### OUTPUT
### genes_matrix : The gene expression matrix for all 276 BLUEPRINT samples
### sample_names : A mapping table from sample id to cell type / biosource
IHEC_genes_matrix <- genes_matrix
IHEC_sample_names <- sample_names
save(IHEC_genes_matrix, IHEC_sample_names, file = "IHEC_BLUEPRINT_geneExpr.Rdata")
#> IHEC_genes_matrix[1:5,1:5]
# s10271 s10272 s10275 s10276 s10279
#ENSG00000000003.13 0.01 0.03 2.39 0.70 0.05
#ENSG00000000005.5 0.00 0.00 0.00 0.00 0.00
#ENSG00000000419.11 30.04 17.49 16.36 5.80 11.99
#ENSG00000000457.12 3.29 2.63 2.70 0.78 1.91
#ENSG00000000460.15 3.16 2.06 1.36 0.49 1.65
#> head(IHEC_sample_names)
# @BIOSOURCE @SAMPLE_ID
#1: CD8-positive, alpha-beta T cell s10506
#2: germinal center B cell s10678
#3: common myeloid progenitor s10518
#4: adult endothelial progenitor cell s11176
#5: neutrophilic myelocyte s10331
#6: common lymphoid progenitor s10486
GEO gene expression datasetGEO Gene Expression Omnibus (GEO) is the public repository for high throughput data at NCBI. [https://www.ncbi.nlm.nih.gov/geo/] Here we show how to retrieve GEOs’ samples, download and prepare a matrix of gene expression. In particular to retrieve the gene expression matrix from Nazor et al. (2012, PMID: 22560082).
require(MoonlightR)
require(GEOquery)
library(devtools)
install_github("ibsquare/MoonlightR")
dataNazor <- getDataGEO(GEOobject = "GSE30652",
platform = "GPL6947")
require(SummarizedExperiment)
GSE30652 <- as.data.frame(exprs(dataNazor))
GSE30652_non_norm <- cbind(ILMN = rownames(GSE30652),
IDmean = rowMeans(GSE30652),
GSE30652)
dataNazor_samples <- pData(dataNazor)
dataNazor_samples <- as.data.frame(dataNazor_samples)
dataNazor_samples <- subset(dataNazor_samples,
select = c("geo_accession","characteristics_ch1.2"))
colnames(dataNazor_samples)[2] <- "CellType"
dataNazor_samples$CellType <- gsub("cell type: ","",dataNazor_samples$CellType)
dataNazor_samples$CellType <- gsub(", undifferentiated","",dataNazor_samples$CellType)
GPL6947_13512 <- fData(dataNazor)
GPL6947_13512_annot <- as.data.frame(GPL6947_13512)
GPL6947_13512_annot <- subset(GPL6947_13512_annot,
select = c("ID","Gene.symbol"))
GSE30652_merge <- merge(x = GPL6947_13512_annot,
y = GSE30652_non_norm,
by.x = "ID",
by.y = "ILMN")
GSE30652_merge <- GSE30652_merge[order(GSE30652_merge$IDmean,decreasing = TRUE),]
GSE30652_merge <- GSE30652_merge[!duplicated(GSE30652_merge$Gene.symbol),]
NazorMatrix <- GSE30652_merge
rownames(NazorMatrix) <- NazorMatrix$Gene.symbol
NazorMatrix <- NazorMatrix[,dataNazor_samples$geo_accession]
save(NazorMatrix,dataNazor_samples, file = "Nazor_GSE30652.Rdata")
In order to make light of cancer development, it is crucial to understand which genes play a role in the mechanisms linked to this disease and moreover which role that is. Commonly biological processes such as proliferation and apoptosis have been linked to cancer progression. Based on expression data we perform functional enrichment analysis, infer gene regulatory networks and upstream regulator analysis to score the importance of well-known biological processes with respect to the studied cancer. We then use these scores to predict two specific roles: genes that act as tumor suppressor genes (TSGs) and genes that act as oncogenes (OCGs). This methodology not only allows us to identify genes with dual role (TSG in one cancer type and OCG in another) but also to elucidate the underlying biological processes.
We propose MoonlightR a new approach to define TSGs and OCGs based on functional enrichment analysis, infer gene regulatory networks and upstream regulator analysis to score the importance of well-known biological processes with respect to the studied cancer.
Moonlight’s pipeline is shown below::
DPA: Differential Phenotype AnalysisDifferential phenotype analysis is able to identify genes or probes that are significantly different between two phenotypes such as normal vs. tumor, or normal vs. stageI, normal vs. molecular subtype.
For gene expression data, DPA is running a differential expression analysis (DEA) to identify differentially expressed genes (DEGs) using the TCGAanalyze_DEA function from .
For methylation data, DPA is running a differentially methylated regions analysis (DMR) to identify differentially methylated CpG sites using the TCGAanalyze_DMR the TCGAanalyze_DMR function from .
require(MoonlightR)
dataDEGs <- DPA(dataFilt = dataBRCA,
dataType = "Gene expression",
fdr.cut = 0.05,
logFC.cut = 0)
We can visualize those differentially expressed genes (DEGs) with a volcano plot using the TCGAVisualize_volcano function from .
library(TCGAbiolinks)
require(MoonlightR)
#> Loading required package: MoonlightR
#> Loading required package: doParallel
#> Loading required package: foreach
#> Loading required package: iterators
#>
#> Setting options('download.file.method.GEOquery'='auto')
#> Setting options('GEOquery.inmemory.gpl'=FALSE)
#> Registered S3 method overwritten by 'enrichplot':
#> method from
#> fortify.enrichResult DOSE
data(DEGsmatrix)
TCGAVisualize_volcano(DEGsmatrix$logFC, DEGsmatrix$FDR,
filename = "DEGs_volcano.png",
x.cut = 1,
y.cut = 0.01,
names = rownames(DEGsmatrix),
color = c("black","red","dodgerblue3"),
names.size = 2,
show.names = "highlighted",
highlight = c("SPINK4","INSL4"),
xlab = " Gene expression fold change (Log2)",
legend = "State",
title = "Volcano plot (Normal NT vs Tumor TP)",
width = 10)
The figure generated by the code above is shown below:
FEA: Functional Enrichment AnalysisThe user can perform a functional enrichment analysis using the function FEAcomplete. For each DEG in the gene set a z-score is calculated. This score indicates how the genes act in the gene set.
require(MoonlightR)
data(DEGsmatrix)
dataFEA <- FEA(DEGsmatrix = DEGsmatrix)
dataFEA_filt <- dataFEA[dataFEA$FDR < 0.01,]
dataFEA_filt <- dataFEA_filt[abs(dataFEA_filt$Moonlight.Z.score) > 1,]
The output can be visualized with a FEA plot (Functional Enrichment Analysis Plot)
The user can plot the result of a functional enrichment analysis using the function plotFEA. A negative z-score indicates that the process’ activity is decreased. A positive z-score indicates that the process’ activity is increased.
plotFEA(dataFEA = dataFEA_filt,
topBP = nrow(dataFEA_filt),
additionalFilename = "FEAplot.pdf",
height = 10,
width = 10)
The figure generated by the above code is shown below:
GRN: Gene Regulatory NetworkThe user can perform a gene regulatory network analysis using the function GRN which infers the network using the parmigene package.
dataGRN <- GRN(TFs = rownames(DEGsmatrix)[1:100],
normCounts = dataFilt,
nGenesPerm = 10,
kNearest = 3,
nBoot = 10)
URA: Upstream Regulator AnalysisThe user can perform upstream regulator analysis using the function URA. This function is applied to each DEG in the enriched gene set and its neighbors in the GRN.
data(dataGRN)
data(DEGsmatrix)
dataFEA <- FEA(DEGsmatrix = DEGsmatrix)
#BPselected <- dataFEA$Diseases.or.Functions.Annotation[1:5]
BPselected <- c("apoptosis","proliferation of cells")
dataURA <- URA(dataGRN = dataGRN,
DEGsmatrix = DEGsmatrix,
BPname = BPselected,
nCores=1)
PRA: Pattern Regognition AnalysisThe user can retrieve TSG/OCG candidates using either selected biological processes or a random forest classifier trained on known COSMIC OCGs/TSGs.
data(dataURA)
dataDual <- PRA(dataURA = dataURA,
BPname = c("apoptosis","proliferation of cells"),
thres.role = 0)
# dataDual$TSG
# ACN9 ADCK5 AFAP1L1
#1.123390 1.126988 1.112131
# dataDual$OCG
# ABCA3 ABCG1 ABCG2 ABHD6 ABTB1 ACADL ACAN ACE2 ACP5 ACP6
#3.3825395 1.1088252 3.3062860 1.6521305 2.7699272 2.8953050 1.6175689 1.1016235 1.4461591 1.7583342
# The user can plot the result of a Upstream regulator analysis using the function `plotURA`.
TSGs_genes <- names(dataDual$TSG)
OCGs_genes <- names(dataDual$OCG)
plotURA(dataURA = dataURA[c(TSGs_genes, OCGs_genes),])
The figure generated by the above code is shown below:
Characterize intertumoral (between tumors) genomic epigenomic and transcriptomic heterogeneity in cancer tissue by identifying dual role genes. In this case study, we used the expression levels of genes in all the samples obtained from TCGA - GDC with IlluminaHiSeq RNASeqV2 in 18 normal tissues and 18 cancer tissues according to the TCGA criteria.We applied the complete Moonlight pipeline (FEA-URA-PRA) to extract those genes that were significantly increasing or decreasing biological processes such as proliferation of cells and apoptosis. Figure showed the results of the moonlight pipeline that can be visualized with a circos plot using the function plotCircos from MoonlightR. For further examples see the MoonlightR’s vignette
http://bioconductor.org/packages/release/bioc/html/MoonlightR.html
The figure mentioned above is shown below:
Thorsson et al. performed an extensive immunogenomic analysis of more than 10,000 tumors comprising 33 diverse cancer types by utilizing data compiled by TCGA. Across cancer types, we identified six immune subtypes—wound healing, IFN-dominant, inflammatory, lymphocyte depleted, immunologically quiet, and TGF-dominant—characterized by differences in macrophage or lymphocyte signatures, Th1:Th2 cell ratio, extent of intratumoral heterogeneity, aneuploidy, extent of neoantigen load, overall cell proliferation, expression of immunomodulatory genes, and prognosis.
In this section we generate the Immune subtypes for TCGA and GEO tumor samples. Figure1C is from Thorsson et al. (2018, PMID: 29628290), to summarize features of six different immune subtypes.
In the following example we will run a Kaplan-Meier Overall Survival Analysis for TCGA KIRC samples breaking by Immune Subtypes
download.file(url = "https://ars.els-cdn.com/content/image/1-s2.0-S1074761318301213-mmc2.xlsx",
destfile = "X1_s2_0_S1074761318301213_mmc2")
X1_s2_0_S1074761318301213_mmc2 <- read_excel("X1_s2_0_S1074761318301213_mmc2")
X1_s2_0_S1074761318301213_mmc2 <- as.data.frame(X1_s2_0_S1074761318301213_mmc2)
CancerType <- c("KIRC")
ImmuneSubtypes <- as.data.frame(X1_s2_0_S1074761318301213_mmc2)
ImmuneSubtypes <- ImmuneSubtypes[ImmuneSubtypes$`TCGA Study` %in% CancerType,]
ImmuneSubtypes <- ImmuneSubtypes[ImmuneSubtypes$`Immune Subtype`!="NA",]
dataClin_KIRC <- GDCquery_clinic(project = "TCGA-KIRC",type = "clinical")
dataClin_merged <- merge(x = dataClin_KIRC,
y = ImmuneSubtypes,
by.x = "submitter_id",
by.y = "TCGA Participant Barcode")
p <- TCGAanalyze_survival(dataClin_merged,
clusterCol = "Immune Subtype",
conf.int = FALSE,
main = "TCGA KIRC Immune Subtypes",
height = 10,
width=10,
risk.table = TRUE,
filename = NULL)
p <- p$plot
p <- p + theme(legend.position = "right")
ggsave(p , filename = "TCGA_KIRC_ImmuneSubtypes_survival.png",width = 10,height = 6)
The result from the Immune subtypes -survival association for TCGA KIRC samples is shown below:
require(MoonlightR)
dataGEO_KIRC<- getDataGEO(GEOobject = "GSE15641",
platform = "GPL96")
require(SummarizedExperiment)
GSE15641 <- as.data.frame(exprs(dataGEO_KIRC))
GSE15641_non_norm <- cbind(ILMN = rownames(GSE15641),
IDmean = rowMeans(GSE15641),
GSE15641)
GSE15641_annot <- fData(dataGEO_KIRC)
GSE15641_annot <- as.data.frame(GSE15641_annot)
GSE15641_annot <- subset(GSE15641_annot,
select = c("ID","Gene.symbol"))
dataGEO_samples <- pData(dataGEO_KIRC)
dataGEO_samples <- as.data.frame(dataGEO_samples)
dataGEO_samples <- subset(dataGEO_samples,
select = c("geo_accession","source_name_ch1"))
colnames(dataGEO_samples)[2] <- "CellType"
dataGEO_samples <- dataGEO_samples[dataGEO_samples$CellType %in% "cancerous human kidney tissue, clear cell RCC",]
GSE15641_merge <- merge(x = GSE15641_annot,
y = GSE15641_non_norm,
by.x = "ID",
by.y = "ILMN")
GSE15641_merge <- GSE15641_merge[order(GSE15641_merge$IDmean,decreasing = TRUE),]
GSE15641_merge <- GSE15641_merge[!duplicated(GSE15641_merge$Gene.symbol),]
GSE15641_Matrix <- GSE15641_merge
rownames(GSE15641_Matrix) <- GSE15641_Matrix$Gene.symbol
GSE15641_Matrix <- GSE15641_Matrix[,dataGEO_samples$geo_accession]
save(GSE15641_Matrix, file = "Jones_GSE15641.Rdata")
# https://github.com/torongs82/Data/blob/master/GEO/Jones_GSE15641.Rdata
load("Jones_GSE15641.Rdata")
load("ImmuneMW.rda") # from data folder
dataImmuneSubtype <- TCGAanalyze_ImmuneSubtypes(ImmuneMW = ImmuneMW,
dataGE = GSE15641_Matrix)
Stemness, defined as the potential for self-renewal and differentiation from the cell of origin, was originally attributed to normal stem cells that possess the ability to give rise to all cell types in the adult organism. Cancer progression involves gradual loss of a differentiated phenotype and acquisition of progenitor-like, stem-cell-like features. Undifferentiated primary tumors are more likely to result in cancer cell spread to distant organs, causing disease progression and poor prognosis, particularly because metastases are usually resistant to available therapies
We defined signatures to quantify stemness using publicly available molecular profiles from normal cell types that exhibit various degrees of stemness.
By multi-platform analyses of their transcriptome, methylome, and transcription-factor binding sites using an innovative one-class logistic regression (OCLR) machine-learning algorithm (Sokolov et al., 2016), we obtained two independent stemness indices. One (mDNAsi) was reflective of epigenetic features; the other (mRNAsi) was reflective of gene expression.
We derived stemness indices using an OCLR algorithm trained on stem cell (ESC, embryonic stemcell; iPSC, induced pluripotent stem cell) classes and their differ-entiated ecto-, meso-, and endoderm progenitors, from NHLBI Progenitor Cell Biology Consortium (PCBC) dataset.
The Overall methodology for Development and Validation of the Stemness Indices from Malta et al. (2018, PMID: 29625051) is shown below:
Stemness Highlights:
In the following subsection we will replicate the methodology followed in the Stemness’s paper. (i) validating the stemness signature trained on PCBC’s dataset in two external dataset (Nazor, IHEC) (ii) Derivation of the stemness score for TCGA BRCA and LGG-GBM samples, with association to molecular subtypes and survival outcomes.
# Firstly we have previously prepared the Gene Expression matrix (genes in rows , samples in columns) # see previous section or download from the following github link
# "https://github.com/torongs82/Data/blob/master/GEO/Nazor_GSE30652.Rdata"
load("./Nazor_GSE30652.Rdata")
require(TCGAbiolinks)
TCGA_mRNA_StemScoreTable <- TCGAanalyze_Stemness(stemSig = PCBC_stemSig,
dataGE = NazorMatrix)
tab_mRNASi <- TCGA_mRNA_StemScoreTable
tab_mRNASi_merged <- merge(x = tab_mRNASi,
y = dataNazor_samples,
by.x = "Sample",
by.y = "geo_accession")
require(ggplot2)
require(ggpubr)
colnames(tab_mRNASi_merged)[3] <- "mRNASi"
tab_mRNASi_merged[tab_mRNASi_merged$CellType %in% "embryonic stem cell","CellType"] <- "ES"
tab_mRNASi_merged[tab_mRNASi_merged$CellType %in% "induced pluripotent stem cell","CellType"] <- "IPS"
tab_mRNASi_merged[tab_mRNASi_merged$CellType %in% "parthenogentic embryonic stem cell","CellType"] <- "Parthenogentic"
tab_mRNASi_merged[tab_mRNASi_merged$CellType %in% "Somatic.Primary","CellType"] <- "Primary"
tab_mRNASi_merged[tab_mRNASi_merged$CellType %in% "Somatic.Tissue","CellType"] <- "Tissue"
tab_mRNASi_merged <- tab_mRNASi_merged[order(tab_mRNASi_merged$mRNASi, decreasing = FALSE),]
library(forcats)
p <- ggplot(data=tab_mRNASi_merged, aes(x=Sample, y=mRNASi, fill = CellType)) +
geom_bar(stat="identity")+
scale_colour_gradient2()+
#coord_flip()+
# ylim(0, 15)+
#scale_x_discrete(limits = df1$Tissue)+
theme_classic() +
theme(axis.title.x=element_blank(),
axis.text.x=element_blank(),
axis.ticks.x=element_blank())+
scale_fill_manual("legend", values = c("Primary" = "orange",
"Parthenogentic" = "blue",
"Tissue" = "darkgreen",
"ES" = "red",
"IPS" = "pink"))
p <- p + theme(legend.position="bottom")
p <- p + aes(x = fct_inorder(Sample))
ggsave(p, file = "Validation_mRNASi_Nazor.png", width = 6,height = 6)
The result from TCGAanalyze_Stemness validation in Nazor’s Dataset is shown below:
# Firstly we have previously prepared the IHEC Gene Expression matrix (genes in rows , samples in columns) # see previous section or download from the following github link
# https://github.com/torongs82/Data/blob/master/IHEC/IHEC_BLUEPRINT_geneExpr.Rdata
load("./IHEC_BLUEPRINT_geneExpr.Rdata")
require(TCGAbiolinks)
require(stringr)
require("biomaRt")
IHECMatrix <- IHEC_genes_matrix
IHEC_annot <- IHEC_sample_names
colnames(IHEC_annot) <- c("Diffname_short","X1")
IHEC_annot <- as.data.frame(IHEC_annot)
ensemblsIDS <- str_split_fixed(rownames(IHECMatrix),"\\.",2)[,1]
rownames(IHECMatrix)<- ensemblsIDS
IHEC_annot_common <- IHEC_annot[IHEC_annot$X1 %in% colnames(IHECMatrix),]
# mapping Ensembl IDs to gene Symbol
require(TCGAbiolinks)
library(clusterProfiler)
library(org.Hs.eg.db)
gene.df <- bitr(ensemblsIDS, fromType = "ENSEMBL",
toType = c( "ENTREZID", "SYMBOL"),
OrgDb = org.Hs.eg.db)
IHECMatrix_sel <- IHECMatrix[gene.df$ENSEMBL,]
rownames(IHECMatrix_sel) <-gene.df$SYMBOL
IHECMatrix_sel <- as.data.frame(IHECMatrix_sel)
IHECMatrix_sel_filt <- TCGAanalyze_Filtering(tabDF = IHECMatrix_sel,
method = "quantile",
qnt.cut = 0.25)
TCGA_mRNA_StemIHEC <- TCGAanalyze_Stemness(stemSig = PCBC_stemSig,
dataGE = IHECMatrix_sel_filt)
colnames(TCGA_mRNA_StemIHEC)[3] <- "mRNASi"
colnames(IHEC_annot)[1] <- "CellType"
colnames(IHEC_annot)[2] <- "Sample"
TCGA_mRNA_StemIHEC_merge<- merge(x = TCGA_mRNA_StemIHEC,
y = IHEC_annot,
by.x = "Sample",
by.y = "Sample")
tab_mRNASi_merged <- TCGA_mRNA_StemIHEC_merge
CellType <- table(tab_mRNASi_merged$CellType)
CellType_filt <- CellType[CellType > 5]
tab_mRNASi_merged_filt <- tab_mRNASi_merged[tab_mRNASi_merged$CellType %in% names(CellType_filt), ]
tab_mRNASi_merged <- tab_mRNASi_merged_filt
tab_mRNASi_merged <- tab_mRNASi_merged[order(tab_mRNASi_merged$mRNASi, decreasing = FALSE),]
require(ggplot2)
library(forcats)
p <- ggplot(data=tab_mRNASi_merged, aes(x=Sample, y=mRNASi, fill = CellType)) +
geom_bar(stat="identity")+
scale_colour_gradient2()+
#coord_flip()+
# ylim(0, 15)+
#scale_x_discrete(limits = df1$Tissue)+
theme_classic() +
theme(axis.title.x=element_blank(),
axis.text.x=element_blank(),
axis.ticks.x=element_blank())
p <- p + theme(legend.position="right")
p <- p + aes(x = fct_inorder(Sample))
p <- p + theme(text = element_text(size=14))
ggsave(p, file = "Validation_mRNASi_IHEC.png", width = 12,height = 6)
The result from TCGAanalyze_Stemness validation in IHEC’s Dataset is shown below:
# Firstly we load Gene Expression matrix (genes in rows , samples in columns)
# from a previous generated pancancer gene expression matrix
load("./dataFilt_panCancer33new.Rdata")
curCancer <- "BRCA"
# We have previously generated a table with 33 cancer types barcodes and molecular subtypes.
dataSubt_PanCancer <- PanCancerAtlas_subtypes()
# Selecting TCGA breast cancer
dataSubt_curCancer <- dataSubt_PanCancer[dataSubt_PanCancer$cancer.type %in% "BRCA",]
commonSamples <- intersect(dataSubt_curCancer$pan.samplesID, colnames(dataFilt))
dataFilt_curCancer <- dataFilt[,commonSamples]
load("./PCBC_stemSig.rda")
TCGA_mRNA_StemScoreTable <- TCGAanalyze_Stemness(stemSig = PCBC_stemSig,
dataGE = dataFilt_curCancer)
colnames(TCGA_mRNA_StemScoreTable)[1] <- "barcode"
sampleNT <- TCGAquery_SampleTypes(barcode = TCGA_mRNA_StemScoreTable$barcode,typesample = "NT")
sampleTP <- TCGAquery_SampleTypes(barcode = TCGA_mRNA_StemScoreTable$barcode,typesample = "TP")
#sampleTM <- TCGAquery_SampleTypes(barcode = TCGA_mRNA_StemScoreTable$barcode,typesample = "TM")
#sampleTAM <- TCGAquery_SampleTypes(barcode = TCGA_mRNA_StemScoreTable$barcode,typesample = "TAM")
#sampleTB <- TCGAquery_SampleTypes(barcode = TCGA_mRNA_StemScoreTable$barcode,typesample = "TB")
rownames(TCGA_mRNA_StemScoreTable) <- TCGA_mRNA_StemScoreTable$barcode
TCGA_mRNA_StemScoreTable[sampleNT,"SampleType"] <- "NT"
TCGA_mRNA_StemScoreTable[sampleTP,"SampleType"] <- "TP"
#TCGA_mRNA_StemScoreTable[sampleTM,"SampleType"] <- "TM"
#TCGA_mRNA_StemScoreTable[sampleTAM,"SampleType"] <- "TM"
#TCGA_mRNA_StemScoreTable[sampleTB,"SampleType"] <- "TB"
tab_mRNASi <- TCGA_mRNA_StemScoreTable
tab_mRNASi_merged <- merge(x = tab_mRNASi,
y = dataSubt_PanCancer,
by.x = "barcode",
by.y = "pan.samplesID")
require(ggplot2)
require(ggpubr)
colnames(tab_mRNASi_merged)[3] <- "mRNASi"
p<-ggplot(tab_mRNASi_merged, aes(x=SampleType, y=mRNASi, fill=SampleType))
p <- p + theme_classic()
p <- p + theme(legend.position="none")
p <- p + rotate_x_text(45)
p <- p + geom_jitter(shape=16, position=position_jitter(0.2), color = "black")
p <- p + geom_boxplot(position=position_dodge(1),outlier.colour = NA)
p <- p + theme(text = element_text(size=20))
ggsave(p , filename = "TCGA_BRCA_mRNASi_TP_NT.png",width = 5,height = 6)
tab_mRNASi_merged_TP <- tab_mRNASi_merged[tab_mRNASi_merged$SampleType %in% "TP",]
p<-ggplot(tab_mRNASi_merged_TP, aes(x=Subtype_Selected, y=mRNASi, fill=Subtype_Selected))
p <- p + theme_classic()
p <- p + theme(legend.position="none")
p <- p + rotate_x_text(45)
p <- p + geom_jitter(shape=16, position=position_jitter(0.2), color = "black")
p <- p + geom_boxplot(position=position_dodge(1),outlier.colour = NA)
p <- p + theme(text = element_text(size=20))
ggsave(p , filename = "TCGA_BRCA_mRNASi_subtypes.png",width = 5,height = 6)
The result from TCGAanalyze_Stemness is shown below:
The result from TCGAanalyze_Stemness with molecular subtypes is shown below:
# working with LGG and GBM
curCancer <- c("LGG","GBM")
# We have previously generated a table with 33 cancer types barcodes and molecular subtypes.
require(TCGAbiolinks)
dataSubt_PanCancer <- PanCancerAtlas_subtypes()
# Selecting TCGA breast cancer
dataSubt_curCancer <- dataSubt_PanCancer[dataSubt_PanCancer$cancer.type %in% curCancer,]
sampleCurCancer <- colnames(dataFilt)
sampleCurCancer <- sampleCurCancer[substr(sampleCurCancer,1,12) %in% dataSubt_curCancer$pan.samplesID]
dataFilt_curCancer <- dataFilt[,sampleCurCancer]
load("~/Dropbox (Personal)/Umiami/Github/TCGAbiolinks/data/PCBC_stemSig.rda")
TCGA_mRNA_StemScoreTable <- TCGAanalyze_Stemness(stemSig = PCBC_stemSig,
dataGE = dataFilt_curCancer)
colnames(TCGA_mRNA_StemScoreTable)[1] <- "barcode"
tab_mRNASi <- TCGA_mRNA_StemScoreTable
tab_mRNASi <- cbind(barcode12 = substr(tab_mRNASi$barcode,1,12),
tab_mRNASi)
tab_mRNASi$barcode12 <- as.character(tab_mRNASi$barcode12)
tab_mRNASi_merged <- merge(x = tab_mRNASi,
y = dataSubt_PanCancer,
by.x = "barcode12",
by.y = "pan.samplesID")
require(ggplot2)
require(ggpubr)
colnames(tab_mRNASi_merged)[4] <- "mRNASi"
p<-ggplot(tab_mRNASi_merged, aes(x=Subtype_Selected, y=mRNASi, fill=Subtype_Selected))
p <- p + theme_classic()
p <- p + theme(legend.position="none")
p <- p + rotate_x_text(45)
p <- p + geom_jitter(shape=16, position=position_jitter(0.2), color = "black")
p <- p + geom_boxplot(position=position_dodge(1),outlier.colour = NA)
p <- p + theme(text = element_text(size=20))
ggsave(p , filename = "TCGA_LGG_GBM_mRNASi_subtypes.png",width = 5,height = 6)
The result from TCGAanalyze_Stemness for LGG and GBM is shown below:
#working with survival glioma
tab_mRNASi_merged <- cbind(mRNASi_level = rep("mRNASi mod",nrow(tab_mRNASi_merged)),
tab_mRNASi_merged)
tabSi<- tab_mRNASi_merged
tabSi$mRNASi_level <- as.character(tabSi$mRNASi_level)
tabSi[tabSi$mRNASi < quantile(tabSi$mRNASi,1/3),"mRNASi_level"] <- "mRNASi Low"
tabSi[tabSi$mRNASi > quantile(tabSi$mRNASi,2/3),"mRNASi_level"] <- "mRNASi High"
require(TCGAbiolinks)
dataClin_LGG <- GDCquery_clinic(project = "TCGA-LGG",type = "clinical")
dataClin_GBM <- GDCquery_clinic(project = "TCGA-GBM",type = "clinical")
dataClin_LGG_GBM <- rbind(dataClin_LGG,dataClin_GBM)
dataClin_LGG_GBM <- dataClin_LGG_GBM[dataClin_LGG_GBM$submitter_id %in% tabSi$barcode12,]
dataClin_merged <- merge(x = dataClin_LGG_GBM,
y = tabSi,
by.x = "submitter_id",
by.y = "barcode12")
p <- TCGAanalyze_survival(dataClin_merged,
clusterCol = "mRNASi_level",
conf.int = FALSE,
main = "TCGA LGG GBM mRNASi",
height = 10,
width=10,
risk.table = TRUE,
filename = NULL)
p <- p$plot
p <- p + theme(legend.position = "bottom")
ggsave(p , filename = "TCGA_LGG_GBM_mRNASi_survival.png",width = 5,height = 6)
The result from the Stemness-survival association for LGG and GBM is shown below:
sessionInfo()
#> R version 3.6.0 (2019-04-26)
#> Platform: x86_64-apple-darwin15.6.0 (64-bit)
#> Running under: macOS High Sierra 10.13.5
#>
#> Matrix products: default
#> BLAS: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRblas.0.dylib
#> LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib
#>
#> locale:
#> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#>
#> attached base packages:
#> [1] grid parallel stats4 stats graphics grDevices utils
#> [8] datasets methods base
#>
#> other attached packages:
#> [1] MoonlightR_1.10.1 doParallel_1.0.14
#> [3] iterators_1.0.10 foreach_1.4.4
#> [5] png_0.1-7 TCGAbiolinks_2.13.1
#> [7] dplyr_0.8.1 SummarizedExperiment_1.14.0
#> [9] DelayedArray_0.10.0 BiocParallel_1.18.0
#> [11] matrixStats_0.54.0 Biobase_2.44.0
#> [13] GenomicRanges_1.36.0 GenomeInfoDb_1.20.0
#> [15] IRanges_2.18.0 S4Vectors_0.22.0
#> [17] BiocGenerics_0.30.0 knitr_1.23
#> [19] BiocStyle_2.12.0
#>
#> loaded via a namespace (and not attached):
#> [1] R.utils_2.8.0 tidyselect_0.2.5
#> [3] RSQLite_2.1.1 AnnotationDbi_1.46.0
#> [5] htmlwidgets_1.3 DESeq_1.36.0
#> [7] munsell_0.5.0 codetools_0.2-16
#> [9] miniUI_0.1.1.1 colorspace_1.4-1
#> [11] GOSemSim_2.8.0 highr_0.8
#> [13] DOSE_3.8.2 labeling_0.3
#> [15] urltools_1.7.3 GenomeInfoDbData_1.2.1
#> [17] hwriter_1.3.2 KMsurv_0.1-5
#> [19] polyclip_1.10-0 bit64_0.9-7
#> [21] farver_1.1.0 downloader_0.4
#> [23] generics_0.0.2 xfun_0.7
#> [25] ggthemes_4.2.0 randomForest_4.6-14
#> [27] EDASeq_2.18.0 R6_2.4.0
#> [29] clue_0.3-57 locfit_1.5-9.1
#> [31] manipulateWidget_0.10.0 gridGraphics_0.4-0
#> [33] bitops_1.0-6 fgsea_1.8.0
#> [35] assertthat_0.2.1 promises_1.0.1
#> [37] scales_1.0.0 ggraph_1.0.2
#> [39] enrichplot_1.2.0 gtable_0.3.0
#> [41] HiveR_0.3.42 sva_3.32.0
#> [43] rlang_0.3.4 genefilter_1.66.0
#> [45] cmprsk_2.2-7 GlobalOptions_0.1.0
#> [47] splines_3.6.0 rtracklayer_1.44.0
#> [49] lazyeval_0.2.2 GEOquery_2.50.5
#> [51] selectr_0.4-1 europepmc_0.3
#> [53] broom_0.5.2 BiocManager_1.30.4
#> [55] rgl_0.100.19 yaml_2.2.0
#> [57] reshape2_1.4.3 GenomicFeatures_1.36.0
#> [59] crosstalk_1.0.0 backports_1.1.4
#> [61] httpuv_1.5.1 qvalue_2.14.1
#> [63] clusterProfiler_3.10.1 tools_3.6.0
#> [65] tcltk_3.6.0 bookdown_0.10
#> [67] ggplotify_0.0.3 ggplot2_3.1.1
#> [69] gplots_3.0.1.1 RColorBrewer_1.1-2
#> [71] ggridges_0.5.1 Rcpp_1.0.1
#> [73] plyr_1.8.4 progress_1.2.2
#> [75] zlibbioc_1.30.0 purrr_0.3.2
#> [77] RCurl_1.95-4.12 prettyunits_1.0.2
#> [79] ggpubr_0.2 GetoptLong_0.1.7
#> [81] viridis_0.5.1 cowplot_0.9.4
#> [83] zoo_1.8-5 ggrepel_0.8.1
#> [85] cluster_2.0.8 magrittr_1.5
#> [87] data.table_1.12.2 DO.db_2.9
#> [89] circlize_0.4.6 triebeard_0.3.0
#> [91] survminer_0.4.3 aroma.light_3.14.0
#> [93] hms_0.4.2 mime_0.6
#> [95] evaluate_0.13 xtable_1.8-4
#> [97] XML_3.98-1.19 jpeg_0.1-8
#> [99] gridExtra_2.3 shape_1.4.4
#> [101] compiler_3.6.0 biomaRt_2.40.0
#> [103] tibble_2.1.1 KernSmooth_2.23-15
#> [105] crayon_1.3.4 R.oo_1.22.0
#> [107] htmltools_0.3.6 mgcv_1.8-28
#> [109] later_0.8.0 tidyr_0.8.3
#> [111] geneplotter_1.62.0 DBI_1.0.0
#> [113] tweenr_1.0.1 matlab_1.0.2
#> [115] ComplexHeatmap_2.0.0 MASS_7.3-51.4
#> [117] ShortRead_1.42.0 Matrix_1.2-17
#> [119] readr_1.3.1 parmigene_1.0.2
#> [121] gdata_2.18.0 R.methodsS3_1.7.1
#> [123] igraph_1.2.4.1 pkgconfig_2.0.2
#> [125] km.ci_0.5-2 rvcheck_0.1.3
#> [127] GenomicAlignments_1.20.0 RISmed_2.1.7
#> [129] xml2_1.2.0 annotate_1.62.0
#> [131] webshot_0.5.1 XVector_0.24.0
#> [133] rvest_0.3.4 stringr_1.4.0
#> [135] tkrgl_0.8 digest_0.6.18
#> [137] ConsensusClusterPlus_1.48.0 Biostrings_2.52.0
#> [139] rmarkdown_1.12 fastmatch_1.1-0
#> [141] survMisc_0.5.5 edgeR_3.26.1
#> [143] gtools_3.8.1 shiny_1.3.2
#> [145] Rsamtools_2.0.0 rjson_0.2.20
#> [147] nlme_3.1-139 jsonlite_1.6
#> [149] viridisLite_0.3.0 limma_3.40.0
#> [151] pillar_1.4.0 lattice_0.20-38
#> [153] httr_1.4.0 survival_2.44-1.1
#> [155] GO.db_3.7.0 glue_1.3.1
#> [157] UpSetR_1.3.3 bit_1.1-14
#> [159] ggforce_0.2.2 stringi_1.4.3
#> [161] blob_1.1.1 caTools_1.17.1.2
#> [163] latticeExtra_0.6-28 memoise_1.1.0